Figure 11
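end for
Update critic with $\phi_i \leftarrow \phi_i - \alpha_\phi \nabla_{\phi_i} L_i$
Update actor $i$ with $\theta_i \leftarrow \theta_i + \alpha_\theta \nabla_{\theta_i} \left( J^{i}_{PG} + \lambda_1 \sum_{j=1}^{N} J^{i,j}_{TS} \right)$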
We trained each agent $i$ with online Q-learning [33] on the $Q_i(a_i, s)$ table using Boltzmann exploration [18]. The Boltzmann temperature is fixed to 1, and we set the learning rate to 0.05 and the discount factor to 0.99. At initialisation, the vertical positions of the target and the ball are fixed, while their horizontal positions are random. In all of our experiments, we use the Adam optimizer [19] to perform parameter updates. We use a buffer size of $10^6$ entries and a batch size of 1024.
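The tabular setup described above can be sketched as follows. This is a minimal illustration, not the authors' code: the temperature, learning rate, and discount factor come from the text, while the table layout (`q_table[state][action]`) and the function names `boltzmann_probs`, `sample_action`, and `q_update` are assumptions made for the example.

```python
import math
import random

TEMPERATURE = 1.0     # Boltzmann temperature, as stated in the text
LEARNING_RATE = 0.05  # as stated in the text
DISCOUNT = 0.99       # as stated in the text


def boltzmann_probs(q_values, temperature=TEMPERATURE):
    """Softmax over Q-values; subtracting the max keeps exp() numerically stable."""
    q_max = max(q_values)
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(q_values, rng, temperature=TEMPERATURE):
    """Draw an action index with probability proportional to exp(Q / temperature)."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_update(q_table, state, action, reward, next_state, done):
    """One online Q-learning step on a hypothetical table q_table[state][action]."""
    bootstrap = 0.0 if done else max(q_table[next_state])
    td_target = reward + DISCOUNT * bootstrap
    q_table[state][action] += LEARNING_RATE * (td_target - q_table[state][action])
    return q_table[state][action]
```

With a temperature of 1 the softmax is applied to the raw Q-values, so exploration narrows only as the gaps between Q-values grow during learning.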
Unsupervised Semantic Correspondence Using Stable Diffusion
Supplementary Material
In this supplementary material we:
provide per-category quantitative results on the SPair-71k dataset;
U-Net layers: randomly selected within the range of 7 to 15.
Learning rate for prompt optimization: a random value between 5e-4 and 0.01.
Noise level: randomly chosen within the range t = 1 to t = 10, where T = 50.
Number of optimization steps: randomly chosen in the range of 100 to 300.
Learning rate for prompt optimization: 2.37 × 10^[…]
Image crop size: crop size as a percentage of the original image is 93.
The architecture in Figure 3 is based on the Stable Diffusion model version 1.4 […]
Correct correspondences are indicated in blue, while incorrect ones are depicted in orange.
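The search ranges listed above could be sampled along the following lines. This is only a sketch: the ranges are copied from the text, but uniform sampling, the choice of a single U-Net layer index per draw, and the names `SEARCH_SPACE` and `sample_config` are assumptions, since the text does not state the sampling distribution.

```python
import random

# Ranges copied from the text; the dictionary keys are hypothetical names.
SEARCH_SPACE = {
    "unet_layer": (7, 15),       # integer layer index
    "noise_level_t": (1, 10),    # integer timestep t, with T = 50
    "num_steps": (100, 300),     # number of optimization steps
    "prompt_lr": (5e-4, 0.01),   # learning rate for prompt optimization
}


def sample_config(rng):
    """Draw one configuration, assuming uniform sampling over each range."""
    return {
        "unet_layer": rng.randint(*SEARCH_SPACE["unet_layer"]),
        "noise_level_t": rng.randint(*SEARCH_SPACE["noise_level_t"]),
        "num_steps": rng.randint(*SEARCH_SPACE["num_steps"]),
        "prompt_lr": rng.uniform(*SEARCH_SPACE["prompt_lr"]),
    }


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draws are reproducible
    for _ in range(3):
        print(sample_config(rng))
```

A log-uniform draw is a common alternative for learning rates; whether it was used here is not stated in the text.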
Predicting Training Time Without Training
Supplementary Material
In both cases we observe that the predicted curve is reasonably close to the actual curve, more so at the beginning of the training (which is expected, since the linear approximation is more likely to hold).

Point-wise similarity of predicted and observed loss curve. Up to now we focused on prediction error rates (see e.g. […] We started defining training time as the first time the (smoothed) loss is below a given threshold (which we then normalized w.r.t. […] In Section 4 we suggest that, in the case of MSE loss, it is possible to predict the training time on a large dataset using a subset of the samples. However, since our training time definition measures the time to reach the asymptotic value (which is what is most useful in practice) rather than the time to reach an absolute threshold, this does not affect the accuracy of the prediction (see Appendix C).
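The training time definition used above (first time the smoothed loss falls below a threshold tied to the asymptotic value) can be illustrated with a small sketch. Several details are assumptions, because the text does not specify them: the smoothing is taken to be a bias-corrected exponential moving average, the asymptotic value is approximated by the last smoothed loss, and the threshold is placed at a fraction `rel_tol` of the gap between the initial and final smoothed loss. The names `smooth` and `training_time` are hypothetical.

```python
def smooth(losses, beta=0.9):
    """Bias-corrected exponential moving average (an assumed smoothing choice)."""
    smoothed, avg = [], 0.0
    for step, loss in enumerate(losses, start=1):
        avg = beta * avg + (1.0 - beta) * loss
        smoothed.append(avg / (1.0 - beta ** step))
    return smoothed


def training_time(losses, rel_tol=0.05, beta=0.9):
    """Index of the first step whose smoothed loss is within rel_tol of the
    initial-to-final gap above the final (assumed asymptotic) smoothed loss."""
    s = smooth(losses, beta)
    start, final = s[0], s[-1]
    threshold = final + rel_tol * (start - final)
    for step, value in enumerate(s):
        if value <= threshold:
            return step
    return None  # reached only for degenerate input such as NaN losses
```

Because the threshold is relative to the final value, rescaling or shifting the whole loss curve leaves the returned step unchanged, which matches the point that the definition tracks convergence to the asymptote and not an absolute loss level.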